@turmelclem turmelclem commented Oct 13, 2025

Content

This PR reworks how the mithril-aggregator retrieves the network configuration parameters, in preparation for their decentralization.

Details

Mithril-aggregator

Changes on both leader and follower:

  • Rework EpochService to use the MithrilNetworkConfigurationProvider system from internal/mithril-protocol-config
    • inform_epoch now fetches its current/next/registration AggregatorEpochSettings from the configuration provider
    • rework how future epoch settings are inserted in the database: previously they were inserted at an offset of +2 by update_epoch_settings, which the state machine ran after inform_new_epoch; they are now inserted at an offset of +1 within inform_epoch, right after the parameters are retrieved from the configuration provider. The parameters are therefore stored at the last moment, right before their usage, instead of ahead of time.
  • Epoch setting store: use an insert or ignore instead of an insert or replace when storing the epoch settings. Stored epoch settings are now considered final, which is why their insertion was moved to right before their usage.
  • Rework handle_discrepancies_at_startup with the aim of making it ready for decentralization:
    • Source its data from the MithrilNetworkConfigurationProvider instead of the aggregator configuration
    • The previous change, combined with the store change, implies that data is now registered for the work epoch window (-1, 0, +1) instead of from -1 to +2
    • Run it later in the dependency injection: after building the ServeCommandDependenciesContainer instead of after building the epoch setting store. This change limits its call to the serve command; previously other commands could call it even if they did not need it at all.
  • Configuration:
    • protocol_parameters and cardano_transactions_signing_config are now optional, but still mandatory for a leader aggregator (the missing configuration error is now handled manually instead of automatically by the config crate)
    • Add optional preload_security_parameter with a default value of 2160. It is used to configure the transactions preloader instead of fetching the security parameter from the cardano_transactions_signing_config configuration.
  • Single signature authenticator: log the inner error when the authentication fails; previously no context was available, so we could not know why an authentication failed
  • Test:
    • strengthen the create_certificate_follower integration test by making it check that the follower aggregator works without configured protocol_parameters
    • rework and simplify the test tooling attached to the ServeCommandDependenciesContainer:
      • no longer stores data in the epoch_settings table, but only in the signer and signer_registration tables. This means that the epoch_settings must have been filled beforehand (either by running handle discrepancies or manually).
      • update usages, notably in the certifier service tests
      • remove now unused methods
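
The switch from "insert or replace" to "insert or ignore", together with storing the registration settings at an offset of +1 inside inform_epoch, can be sketched as below. This is only an illustration under assumptions: the `BTreeMap` stands in for the epoch_settings SQLite table, and `EpochSettings`, `insert_or_ignore` and this simplified `inform_epoch` signature are hypothetical names, not the aggregator's actual API.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins: the real store is the `epoch_settings` SQLite table.
type Epoch = u64;

#[derive(Debug, Clone, PartialEq)]
struct EpochSettings {
    k: u64, // placeholder for the real protocol parameters
}

/// "Insert or ignore": an existing row is left untouched.
/// Returns `true` only when a new row was actually written.
fn insert_or_ignore(
    store: &mut BTreeMap<Epoch, EpochSettings>,
    epoch: Epoch,
    settings: EpochSettings,
) -> bool {
    if store.contains_key(&epoch) {
        return false;
    }
    store.insert(epoch, settings);
    true
}

/// Simplified view of the described flow: the registration settings are
/// stored at `epoch + 1`, right when the epoch is announced.
fn inform_epoch(
    store: &mut BTreeMap<Epoch, EpochSettings>,
    epoch: Epoch,
    registration: EpochSettings,
) -> bool {
    insert_or_ignore(store, epoch + 1, registration)
}

fn main() {
    let mut store = BTreeMap::new();
    let first = inform_epoch(&mut store, 10, EpochSettings { k: 5 });
    // A later attempt with different parameters for the same epoch is ignored.
    let second = inform_epoch(&mut store, 10, EpochSettings { k: 99 });
    println!("first={first} second={second} stored={:?}", store.get(&11));
}
```

With "insert or replace", the second call would have overwritten the row for epoch 11; with "insert or ignore", the first stored value stays final.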

Leader aggregator specific:

  • Add LocalMithrilNetworkConfigurationProvider: a MithrilNetworkConfigurationProvider that first fetches its data from the epoch_settings table in the sqlite database and, if an entry is missing for an epoch, falls back to the usual configuration parameters (protocol_parameters and cardano_transactions_signing_config)
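
The "stored value first, configuration as fallback" lookup described above could look roughly like the sketch below. It assumes a simplified model: `LocalProvider`, `EpochSettings` and `settings_for` are hypothetical names, and a `BTreeMap` replaces the epoch_settings table.

```rust
use std::collections::BTreeMap;

type Epoch = u64;

#[derive(Debug, Clone, PartialEq)]
struct EpochSettings {
    k: u64, // placeholder for protocol parameters and signing config
}

/// Hypothetical leader-side provider: stored settings win,
/// the node configuration is only a fallback.
struct LocalProvider {
    stored: BTreeMap<Epoch, EpochSettings>, // stand-in for the `epoch_settings` table
    configured: EpochSettings,              // stand-in for the node configuration
}

impl LocalProvider {
    fn settings_for(&self, epoch: Epoch) -> EpochSettings {
        self.stored
            .get(&epoch)
            .cloned()
            .unwrap_or_else(|| self.configured.clone())
    }
}

fn main() {
    let mut stored = BTreeMap::new();
    stored.insert(10, EpochSettings { k: 5 });
    let provider = LocalProvider { stored, configured: EpochSettings { k: 7 } };

    // Epoch 10 has a stored entry; epoch 11 falls back to the configuration.
    println!("{:?}", provider.settings_for(10));
    println!("{:?}", provider.settings_for(11));
}
```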

Follower aggregator specific:

  • Use mithril-protocol-config::http::HttpMithrilNetworkConfigurationProvider as its network configuration provider, fetching data from its configured leader aggregator

Mithril-common & openapi

  • EpochSettingsMessage:
    • make signer_registration_protocol field optional so future nodes won't fail when we remove it
    • deprecate signer_registration_protocol and cardano_transactions_signing_config fields

Mithril-end-to-end

  • Make the update_protocol_parameters step leader only. The step no longer matters for the follower, since it retrieves its configuration from the leader and no longer reads its own configuration. Keeping it also introduced flakiness: sometimes the follower aggregator restarted before the leader could restart its HTTP server, making the follower's handle discrepancies fail (since it calls the configuration provider).
  • Wrap tailed logs and extracted errors in a ::group:: when running in GitHub Actions. Disabled for now as there's a remaining issue to tackle first: when the e2e is retried, the logs in those groups for the previous iteration are missing from the action output.

Pre-submit checklist

  • Branch
    • Tests are provided (if possible)
    • Crates versions are updated (if relevant)
    • CHANGELOG file is updated (if relevant)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
  • PR
    • All check jobs of the CI have succeeded
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested
  • Documentation
    • Update README file (if relevant)
    • Update documentation website (if relevant)
    • No new TODOs introduced

Comments

There's a known issue with the updated handle_discrepancies_at_startup: it runs twice.
This is because the ServeCommandDependenciesContainer is also built twice: once for its own purpose, and a second time for the HTTP server.
This is harmless, as the same epoch settings are recorded twice and the second recording is ignored, but we should probably construct a HttpServerDependenciesContainer instead of reusing the one of the serve command.
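
To illustrate why the double run is harmless, here is a minimal sketch of an idempotent startup routine over the work epoch window (-1, 0, +1). It is a simplification under assumptions: the function name mirrors the real one but the signature is hypothetical, a single settings value is used for all three epochs, and a `BTreeMap` stands in for the epoch_settings table.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

type Epoch = u64;

/// Hypothetical startup routine: fill the work window
/// (current - 1, current, current + 1) without overwriting existing rows.
/// Returns the number of rows actually written.
fn handle_discrepancies(
    store: &mut BTreeMap<Epoch, String>,
    current: Epoch,
    settings: &str,
) -> usize {
    let mut written = 0;
    for epoch in current.saturating_sub(1)..=current + 1 {
        // Only a vacant slot is filled: this is the "insert or ignore" behaviour.
        if let Entry::Vacant(slot) = store.entry(epoch) {
            slot.insert(settings.to_string());
            written += 1;
        }
    }
    written
}

fn main() {
    let mut store = BTreeMap::new();
    let first_run = handle_discrepancies(&mut store, 10, "params");
    let second_run = handle_discrepancies(&mut store, 10, "params");
    // The second run finds every epoch of the window already stored.
    println!("first run wrote {first_run}, second run wrote {second_run}");
}
```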

Issue(s)

Relates to #2692

github-actions bot commented Oct 13, 2025

Test Results

    4 files  ± 0    168 suites  ±0   23m 58s ⏱️ +6s
2 213 tests + 6  2 213 ✅ + 6  0 💤 ±0  0 ❌ ±0 
6 901 runs  +14  6 901 ✅ +14  0 💤 ±0  0 ❌ ±0 

Results for commit 3f771a3. ± Comparison against base commit c7220be.

This pull request removes 3 and adds 9 tests. Note that renamed tests count towards both.

Removed:
mithril-aggregator ‑ database::query::epoch_settings::update_epoch_settings::tests::test_update_epoch_settings
mithril-aggregator ‑ runtime::runner::tests::test_update_epoch_settings
mithril-aggregator ‑ services::epoch_service::tests::update_epoch_settings_insert_future_epoch_settings_in_the_store

Added:
mithril-aggregator ‑ database::query::epoch_settings::insert_or_ignore_epoch_settings::tests::test_cant_replace_existing_value
mithril-aggregator ‑ database::query::epoch_settings::insert_or_ignore_epoch_settings::tests::test_insert_epoch_setting_in_empty_db
mithril-aggregator ‑ database::repository::epoch_settings_store::tests::save_epoch_settings_does_not_replace_existing_value_in_database
mithril-aggregator ‑ services::epoch_service::tests::inform_epoch_compute_allowed_discriminants_from_intersection_of_aggregation_network_config_and_configured_discriminants
mithril-aggregator ‑ services::epoch_service::tests::inform_epoch_insert_registration_epoch_settings_in_the_store
mithril-aggregator ‑ services::network_configuration_provider::tests::get_stored_configuration_with_stored_value_returns_them
mithril-aggregator ‑ services::network_configuration_provider::tests::get_stored_configuration_without_stored_value_fallback_to_configuration_value
mithril-aggregator ‑ services::network_configuration_provider::tests::test_get_network_configuration_retrieve_configurations_for_aggregation_next_aggregation_and_registration
mithril-common ‑ messages::epoch_settings::tests::test_current_json_deserialized_into_message_supported_until_open_api_0_1_55

♻️ This comment has been updated with latest results.

@turmelclem turmelclem force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from 407ea74 to bc2e5d5 Compare October 29, 2025 15:22
@turmelclem turmelclem self-assigned this Oct 29, 2025
@turmelclem turmelclem force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from 120452b to 4e21205 Compare November 3, 2025 16:02
@Alenar Alenar temporarily deployed to testing-preview November 5, 2025 09:01 — with GitHub Actions Inactive
@Alenar Alenar force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from 1b74219 to 890ad0b Compare November 5, 2025 14:58
@Alenar Alenar temporarily deployed to testing-preview November 5, 2025 15:08 — with GitHub Actions Inactive
@Alenar Alenar force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch 4 times, most recently from 07d14a3 to 80abe55 Compare November 6, 2025 11:12
@Alenar Alenar temporarily deployed to testing-preview November 6, 2025 17:20 — with GitHub Actions Inactive
@turmelclem turmelclem force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch 3 times, most recently from ea59e83 to de20459 Compare November 7, 2025 14:16
@Alenar Alenar requested a review from jpraynaud November 10, 2025 10:35
@Alenar Alenar self-assigned this Nov 10, 2025
@Alenar Alenar force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from de20459 to 2097da6 Compare November 10, 2025 11:19
Alenar and others added 3 commits November 10, 2025 13:04
…gning_config optional (mandatory for leader)
Rework `init_state_from_fixture` to not save epoch_settings and work
with the fixed window of three epochs (aggregate/next aggregate/signer
registration). Epoch settings should already exist; most of the time
they will be inserted by the handle discrepancies system
Alenar and others added 6 commits November 10, 2025 13:04
- run it at the end of the serve dependency container build
- retrieve and save data from the network configuration provider instead
  of the local node configuration
- update follower integration test to check that local protocol
  parameter configuration is not read, instead the configuration is read
  through the network configuration provider from the leader
Since the follower now reads the network config from the leader, the
update of the protocol parameters is now a responsibility
of the leader only.

This led to flakiness because this step was restarting all aggregators,
and sometimes the follower started before the leader and got an error
when it executed its handle discrepancies because the leader HTTP server
was still down.
until we can figure out how to make the log groups work correctly when
the e2e is retried by `nick-fields/retry` (currently the groups from before the retry are lost)
@Alenar Alenar force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from 2097da6 to 6319f63 Compare November 10, 2025 12:07
@Alenar Alenar marked this pull request as ready for review November 10, 2025 12:08
@Alenar Alenar temporarily deployed to testing-preview November 10, 2025 12:17 — with GitHub Actions Inactive
…ction between local configuration and network configuration

For a leader aggregator this does not change anything right now since both
values come from the aggregator configuration.
For followers this allows them to use a subset of the signed entity types
allowed by their leader.
@Alenar Alenar temporarily deployed to testing-preview November 10, 2025 16:08 — with GitHub Actions Inactive
… `cardano_transactions_signing_config` from `EpochSettingsMessage`

- mark both fields as deprecated
- make `signer_registration_protocol` optional
…tingsMessage`

This ensures that nodes built with this patch won't fail when those
fields are removed.
@Alenar Alenar temporarily deployed to testing-preview November 12, 2025 10:46 — with GitHub Actions Inactive
@jpraynaud jpraynaud requested a review from Copilot November 12, 2025 17:08
Copilot finished reviewing on behalf of jpraynaud November 12, 2025 17:11

Copilot AI left a comment

Pull Request Overview

This PR implements phase 1 of configuration parameter decentralization for the Mithril aggregator by introducing a network configuration provider system that enables follower aggregators to retrieve protocol parameters from the leader instead of requiring local configuration.

Key Changes:

  • Introduced MithrilNetworkConfigurationProvider with two implementations: LocalMithrilNetworkConfigurationProvider for leader aggregators (reads from database with local config fallback) and HttpMithrilNetworkConfigurationProvider for follower aggregators (fetches from leader)
  • Reworked EpochService to use the configuration provider, eliminating the update_epoch_settings method and consolidating epoch settings insertion into inform_epoch
  • Made protocol_parameters and cardano_transactions_signing_config optional in configuration, with manual validation for leader aggregators

Reviewed Changes

Copilot reviewed 47 out of 48 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| openapi.yaml | Deprecated signer_registration_protocol and cardano_transactions_signing_config fields in EpochSettingsMessage, making them optional for backward compatibility |
| mithril-test-lab/mithril-end-to-end/src/utils/formatting.rs | Added LogGroup utility for formatting test logs with optional GitHub Actions group support |
| mithril-test-lab/mithril-end-to-end/src/utils/mithril_command.rs | Refactored to use LogGroup for consistent log formatting |
| mithril-test-lab/mithril-end-to-end/src/end_to_end_spec.rs | Made update_protocol_parameters leader-only to prevent follower flakiness |
| mithril-common/src/messages/epoch_settings.rs | Made signer_registration_protocol_parameters optional and deprecated it along with cardano_transactions_signing_config |
| mithril-common/src/test/double/dummies.rs | Updated dummy implementation for optional signer_registration_protocol_parameters |
| mithril-aggregator/src/services/network_configuration_provider.rs | New file implementing LocalMithrilNetworkConfigurationProvider for leader aggregators |
| mithril-aggregator/src/services/epoch_service.rs | Reworked to use MithrilNetworkConfigurationProvider and consolidated epoch settings insertion into inform_epoch |
| mithril-aggregator/src/services/message.rs | Updated to handle optional deprecated fields in EpochSettingsMessage |
| mithril-aggregator/src/store/epoch_settings_storer.rs | Changed handle_discrepancies_at_startup to use MithrilNetworkConfiguration and store settings for work epoch window |
| mithril-aggregator/src/database/repository/epoch_settings_store.rs | Changed from insert or replace to insert or ignore to treat stored epoch settings as final |
| mithril-aggregator/src/database/query/epoch_settings/*.rs | Replaced UpdateEpochSettingsQuery with InsertOrIgnoreEpochSettingsQuery |
| mithril-aggregator/src/dependency_injection/builder/*.rs | Moved handle_discrepancies_at_startup to run after building ServeCommandDependenciesContainer and added network configuration provider building |
| mithril-aggregator/src/configuration.rs | Made protocol_parameters and cardano_transactions_signing_config optional with manual validation for leaders; added preload_security_parameter config |
| mithril-aggregator/src/runtime/*.rs | Removed update_epoch_settings calls from state machine |
| mithril-aggregator/src/entities/*.rs | Removed protocol parameters and signing config from LeaderAggregatorEpochSettings |
| mithril-aggregator/src/tools/single_signature_authenticator.rs | Enhanced error logging to include authentication failure reasons |
| mithril-aggregator/tests/*.rs | Updated integration tests to handle optional protocol parameters and simplified test setup |
| docs/website/root/manual/develop/nodes/mithril-aggregator.md | Added documentation for new preload_security_parameter configuration |

) -> StdResult<AggregatorEpochSettings> {
    Ok(AggregatorEpochSettings {
        protocol_parameters: self.protocol_parameters().with_context(
            || "Configuration `protocol_parameter` is mandatory for a Leader Aggregator",

Copilot AI Nov 12, 2025

Typo in the configuration field name: the error message refers to protocol_parameter (singular) but the actual field name is protocol_parameters (plural). This should be corrected to match the actual field name for clarity.

Suggested change
- || "Configuration `protocol_parameter` is mandatory for a Leader Aggregator",
+ || "Configuration `protocol_parameters` is mandatory for a Leader Aggregator",
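
For context, the manual "mandatory for a leader" check on a now-optional configuration value can be sketched as follows. This is a standalone approximation: the PR code uses `with_context` on the value returned by `protocol_parameters()`, while this sketch uses only the standard library (`Option::ok_or_else`), and `Configuration` and `leader_protocol_parameters` are hypothetical, trimmed-down names.

```rust
/// Hypothetical, trimmed-down configuration: the field is now optional.
#[derive(Debug, Default)]
struct Configuration {
    protocol_parameters: Option<String>,
}

/// Leader-only check: a missing optional value becomes an explicit error
/// that names the configuration field exactly as the user must write it.
fn leader_protocol_parameters(config: &Configuration) -> Result<&str, String> {
    config.protocol_parameters.as_deref().ok_or_else(|| {
        "Configuration `protocol_parameters` is mandatory for a Leader Aggregator".to_string()
    })
}

fn main() {
    let leader = Configuration { protocol_parameters: Some("k=5".to_string()) };
    let missing = Configuration::default();

    println!("{:?}", leader_protocol_parameters(&leader));
    println!("{:?}", leader_protocol_parameters(&missing));
}
```

Using the exact field name in the message lets an operator search for it directly in the configuration file or environment.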

Comment on lines 44 to 48
/// call and the epoch service call.
async fn handle_discrepancies_at_startup(
&self,
current_epoch: Epoch,
epoch_settings_configuration: &AggregatorEpochSettings,
network_configuration: &MithrilNetworkConfiguration,
) -> StdResult<()> {

Copilot AI Nov 12, 2025

The doc comment refers to "recording epoch window (-1, 0, +1)" but this appears to be incorrect based on the implementation. The code stores settings for three epochs:

  • signer_retrieval_epoch (current epoch - 1)
  • current_epoch (epoch)
  • recording_epoch (current epoch + 1)

This is actually epochs at offsets (-1, 0, +1) relative to the current epoch. The comment should be updated to clarify that these are relative to the network_configuration.epoch parameter, not a "recording epoch".

use crate::database::record::EpochSettingsRecord;

/// Query to update [EpochSettingsRecord] in the sqlite database

Copilot AI Nov 12, 2025

The doc comment on line 7 states "Query to update [EpochSettingsRecord]" but this is actually an insert operation (insert or ignore), not an update. The comment should be corrected to "Query to insert [EpochSettingsRecord]" or "Query to insert [EpochSettingsRecord] if it doesn't exist".

Suggested change
- /// Query to update [EpochSettingsRecord] in the sqlite database
+ /// Query to insert [EpochSettingsRecord] into the sqlite database if it doesn't exist

.cardano_transactions
.clone()
.ok_or(anyhow!(
"missing cardano transactions signing config for epoch {epoch}"

Copilot AI Nov 12, 2025

The error message format is inconsistent with others in the same file. Lines 307-309, 321-323, and 335-337 in epoch_service.rs all use the format "Missing cardano transactions signing config for <description> epoch {:?}" with debug formatting ({:?}), while this line uses "missing cardano transactions signing config for epoch {epoch}" with regular formatting and lowercase "missing". Consider using consistent capitalization and formatting across all related error messages.

Suggested change
- "missing cardano transactions signing config for epoch {epoch}"
+ "Missing cardano transactions signing config for epoch {:?}",
+ epoch
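
As an illustration of what the two placeholders produce, the sketch below formats the same value with `{epoch}` (Display) and `{:?}` (Debug). The `Epoch` newtype and its `Display` implementation here are hypothetical stand-ins, not the mithril-common type, so the real output may differ.

```rust
use std::fmt;

/// Hypothetical stand-in for the epoch type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Epoch(u64);

impl fmt::Display for Epoch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// `{epoch}` captures the variable and uses its `Display` implementation.
fn missing_config_display(epoch: Epoch) -> String {
    format!("Missing cardano transactions signing config for epoch {epoch}")
}

/// `{:?}` uses the derived `Debug` implementation, which includes the type name.
fn missing_config_debug(epoch: Epoch) -> String {
    format!("Missing cardano transactions signing config for epoch {:?}", epoch)
}

fn main() {
    // Ends with "epoch 42"
    println!("{}", missing_config_display(Epoch(42)));
    // Ends with "epoch Epoch(42)"
    println!("{}", missing_config_debug(Epoch(42)));
}
```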

Comment on lines +132 to +134
///
/// Note: `epoch_settings` store must have data for the inserted epochs, this should be done
/// automatically when building the [ServeCommandDependenciesContainer] by `handle_discrepancies_at_startup`

Copilot AI Nov 12, 2025

[nitpick] The method comment on lines 132-134 mentions that epoch_settings store must be filled beforehand by handle_discrepancies_at_startup, but this introduces a circular dependency in understanding: the test helper depends on handle_discrepancies being run first, but it's not clear from the code structure when/where that happens. Consider adding a reference to where handle_discrepancies_at_startup is called (in DependenciesBuilder::build_serve_dependencies_container) to make the dependency chain clearer.
